63 research outputs found

    A Knowledge-Based Topic Modeling Approach for Automatic Topic Labeling

    Get PDF
    Probabilistic topic models, which aim to discover latent topics in text corpora define each document as a multinomial distributions over topics and each topic as a multinomial distributions over words. Although, humans can infer a proper label for each topic by looking at top representative words of the topic but, it is not applicable for machines. Automatic Topic Labeling techniques try to address the problem. The ultimate goal of topic labeling techniques are to assign interpretable labels for the learned topics. In this paper, we are taking concepts of ontology into consideration instead of words alone to improve the quality of generated labels for each topic. Our work is different in comparison with the previous efforts in this area, where topics are usually represented with a batch of selected words from topics. We have highlighted some aspects of our approach including: 1) we have incorporated ontology concepts with statistical topic modeling in a unified framework, where each topic is a multinomial probability distribution over the concepts and each concept is represented as a distribution over words; and 2) a topic labeling model according to the meaning of the concepts of the ontology included in the learned topics. The best topic labels are selected with respect to the semantic similarity of the concepts and their ontological categorizations. We demonstrate the effectiveness of considering ontological concepts as richer aspects between topics and words by comprehensive experiments on two different data sets. In another word, representing topics via ontological concepts shows an effective way for generating descriptive and representative labels for the discovered topics

    Text Summarization Techniques: A Brief Survey

    Get PDF
    In recent years, there has been a explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods.Comment: Some of references format have update

    ProKinO: An Ontology for Integrative Analysis of Protein Kinases in Cancer

    Get PDF
    Protein kinases are a large and diverse family of enzymes that are genomically altered in many human cancers. Targeted cancer genome sequencing efforts have unveiled the mutational profiles of protein kinase genes from many different cancer types. While mutational data on protein kinases is currently catalogued in various databases, integration of mutation data with other forms of data on protein kinases such as sequence, structure, function and pathway is necessary to identify and characterize key cancer causing mutations. Integrative analysis of protein kinase data, however, is a challenge because of the disparate nature of protein kinase data sources and data formats., where the mutations are spread over 82 distinct kinases. We also provide examples of how ontology-based data analysis can be used to generate testable hypotheses regarding cancer mutations.

    Snake-in-the-box codes for dimension 7

    No full text
    In the n-dimensional hypercube, an n-snake is a simple path with no chords, while an n-coil is a simple cycle without chords. There has been much interest in determining the length of a maximum n-snake and a maximum n-coil. Only upper and lower bounds for these maximum lengths are known for arbitrary n. Computationally, the problem of nding maximum n-snakes and n-coils su ers from combinatorial explosion, in that the size of the solution space which mustbesearched grows very rapidly as n increases. Previously, the maximum lengths of n-snakes and n-coils have been established only for n 7 and n 6, respectively. In this paper, we report on a coil searching computer program which established that 48 is the maximum length of a coil in the hypercube of dimension 7.
    • …
    corecore